Speaker Recognition Via Nonlinear Discriminant Features

Authors

  • Lara Stoll
  • Joe Frankel
  • Nikki Mirghafori
Abstract

We use a multi-layer perceptron (MLP) to transform cepstral features into features better suited for speaker recognition. Two types of MLP output targets are considered: phones (Tandem/HATS-MLP) and speakers (Speaker-MLP). In the former case, output activations are used as features in a GMM speaker recognition system, while for the latter, hidden activations are used as features in an SVM system. Using a smaller set of MLP training speakers, chosen through clustering, yields system performance similar to that of a Speaker-MLP trained with many more speakers. For the NIST Speaker Recognition Evaluation 2004, both Tandem/HATS-GMM and Speaker-SVM systems improve upon a basic GMM baseline, but are unable to contribute in a score-level combination with a state-of-the-art GMM system. It may be that the application of normalizations and channel compensation techniques to the current state-of-the-art GMM has reduced channel mismatch errors to the point that contributions of the MLP systems are no longer additive.
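
The two feature routes described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the network sizes, the random (untrained) weights, the log applied to output activations, and the averaging of hidden activations over an utterance are all assumptions made for illustration, and every function name is hypothetical.

```python
import math
import random

def make_mlp(n_in, n_hidden, n_out, seed=0):
    """Build a one-hidden-layer MLP with small random weights (untrained; illustration only)."""
    rng = random.Random(seed)
    def mat(rows, cols):
        return [[rng.gauss(0.0, 0.1) for _ in range(cols)] for _ in range(rows)]
    return {"W1": mat(n_hidden, n_in), "b1": [0.0] * n_hidden,
            "W2": mat(n_out, n_hidden), "b2": [0.0] * n_out}

def affine(W, b, x):
    """Compute W x + b for a list-of-rows matrix W."""
    return [sum(w * xi for w, xi in zip(row, x)) + bi for row, bi in zip(W, b)]

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def softmax(z):
    m = max(z)  # subtract the max for numerical stability
    e = [math.exp(v - m) for v in z]
    s = sum(e)
    return [v / s for v in e]

def forward(mlp, frame):
    """Return (hidden activations, output activations) for one cepstral frame."""
    hidden = [sigmoid(z) for z in affine(mlp["W1"], mlp["b1"], frame)]
    outputs = softmax(affine(mlp["W2"], mlp["b2"], hidden))
    return hidden, outputs

def phone_mlp_features(mlp, frames):
    """Phone-target route: per-frame output activations as features for a GMM.
    Taking the log is an assumption of this sketch, not a detail from the abstract."""
    return [[math.log(max(p, 1e-12)) for p in forward(mlp, f)[1]] for f in frames]

def speaker_mlp_features(mlp, frames):
    """Speaker-target route: hidden activations as features for an SVM.
    Averaging over the utterance to get one fixed-length vector is an assumption."""
    hidden_per_frame = [forward(mlp, f)[0] for f in frames]
    n = len(hidden_per_frame)
    return [sum(col) / n for col in zip(*hidden_per_frame)]

# Hypothetical usage: 20 frames of 13-dimensional cepstra drawn at random.
rng = random.Random(1)
frames = [[rng.gauss(0.0, 1.0) for _ in range(13)] for _ in range(20)]
phone_mlp = make_mlp(n_in=13, n_hidden=8, n_out=5)      # 5 phone classes (assumed)
speaker_mlp = make_mlp(n_in=13, n_hidden=8, n_out=4)    # 4 training speakers (assumed)
gmm_input = phone_mlp_features(phone_mlp, frames)       # 20 vectors of length 5
svm_input = speaker_mlp_features(speaker_mlp, frames)   # 1 vector of length 8
```

In this sketch the phone-target network yields one feature vector per frame, which suits a frame-level GMM, while the speaker-target network yields a single fixed-length vector per utterance, which suits an SVM. In practice both networks would be trained on labelled speech before their activations were used as features.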

Similar resources

Face Recognition by Cognitive Discriminant Features

Face recognition is still an active pattern analysis topic. Faces have already been treated as objects or textures, but the human face recognition system takes a different approach. People refer to faces by their most discriminant features, usually describing them in sentences like "She's snub-nosed", "he's got a long nose", or "he's got round eyes", and so on. These...

Nonlinear Discriminant Feature Extraction for Robust Text-independent Speaker Recognition

We study a nonlinear discriminant analysis (NLDA) technique that extracts a speaker-discriminant feature set. Our approach is to train a multilayer perceptron (MLP) to maximize the separation between speakers by nonlinearly projecting a large set of acoustic features (e.g., several frames) to a lower-dimensional feature set. The extracted features are optimized to discriminate between speakers ...

Neural networks for nonlinear discriminant analysis in continuous speech recognition

In this paper, neural networks for nonlinear discriminant analysis in continuous speech recognition are presented. Multilayer perceptrons are used to estimate a posteriori probabilities for hidden Markov model (HMM) states, which are the optimal discriminant features for the separation of the HMM states. The a posteriori probabilities are transformed by a principal component analysis to calculate the...

MLP Internal Representation as Discriminative Features for Improved Speaker Recognition

Feature projection by non-linear discriminant analysis (NLDA) can substantially increase classification performance. In automatic speech recognition (ASR) the projection provided by the pre-squashed outputs from a one hidden layer multi-layer perceptron (MLP) trained to recognise speech subunits (phonemes) has previously been shown to significantly increase ASR performance. An analogous approac...

Combining features via LDA in speaker recognition

Z.P. Sun & J.S. Mason, Department of Electrical & Electronic Engineering, University College of Wales, Swansea, SA2 8PP, UK. Abstract: This paper discusses cepstral feature combinations via linear discriminant analysis (LDA) in the context of automatic speaker identification (ASI). Two static cepstral features are considered, namely standard MFC...

Publication date: 2007